ABSTRACT
During the outbreak of the COVID-19 pandemic, many people shared their symptoms across Online Social Networks (OSNs) like Twitter, hoping for others' advice or moral support. Prior studies have shown that those who disclose health-related information across OSNs often tend to regret it and delete their publications afterwards. Hence, deleted posts containing sensitive data can be seen as manifestations of online regrets. In this work, we present an analysis of deleted content on Twitter during the outbreak of the COVID-19 pandemic. For this, we collected more than 3.67 million tweets describing COVID-19 symptoms (e.g., fever, cough, and fatigue) posted between January and April 2020. We observed that around 24% of the tweets containing personal pronouns were deleted either by their authors or by the platform after one year. As a practical application of the resulting dataset, we explored its suitability for the automatic classification of regrettable content on Twitter. © 2023 Owner/Author.
ABSTRACT
We describe the fifth edition of the CheckThat! lab, part of the 2022 Conference and Labs of the Evaluation Forum (CLEF). The lab evaluates technology supporting tasks related to factuality in multiple languages: Arabic, Bulgarian, Dutch, English, German, Spanish, and Turkish. Task 1 asks to identify relevant claims in tweets in terms of check-worthiness, verifiability, harmfullness, and attention-worthiness. Task 2 asks to detect previously fact-checked claims that could be relevant to fact-check a new claim. It targets both tweets and political debates/speeches. Task 3 asks to predict the veracity of the main claim in a news article. CheckThat! was the most popular lab at CLEF-2022 in terms of team registrations: 137 teams. More than one-third (37%) of them actually participated: 18, 7, and 26 teams submitted 210, 37, and 126 official runs for tasks 1, 2, and 3, respectively.
ABSTRACT
Social media has become popular among users for social interaction and news sources. Users spread misinformation in multiple data formats. However, systematic studying of social media phenomena has been challenging due to the lack of labelled data. This paper presents a semi-automated annotation framework AMUSED for gathering multilingual multimodal annotated data from social networking sites. The framework is designed to mitigate the workload in collecting and annotating social media data by cohesively combining machines and humans in the data collection process. AMUSED detects links to social media posts from a given list of news articles and then downloads the data from the respective social networking sites and labels them. The framework gathers the annotated data from multiple platforms like Twitter, YouTube, and Reddit. For the use case, we have implemented the framework for collecting COVID-19 misinformation data from different social media sites and have categorised 8,077 fact-checked articles into four different classes of misinformation. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
ABSTRACT
The fifth edition of the CheckThat! Lab is held as part of the 2022 Conference and Labs of the Evaluation Forum (CLEF). The lab evaluates technology supporting various factuality tasks in seven languages: Arabic, Bulgarian, Dutch, English, German, Spanish, and Turkish. Task 1 focuses on disinformation related to the ongoing COVID-19 infodemic and politics, and asks to predict whether a tweet is worth fact-checking, contains a verifiable factual claim, is harmful to the society, or is of interest to policy makers and why. Task 2 asks to retrieve claims that have been previously fact-checked and that could be useful to verify the claim in a tweet. Task 3 is to predict the veracity of a news article. Tasks 1 and 3 are classification problems, while Task 2 is a ranking one.
ABSTRACT
We describe the fourth edition of the CheckThat! Lab, part of the 2021 Conference and Labs of the Evaluation Forum (CLEF). The lab evaluates technology supporting tasks related to factuality, and covers Arabic, Bulgarian, English, Spanish, and Turkish. Task 1 asks to predict which posts in a Twitter stream are worth fact-checking, focusing on COVID-19 and politics (in all five languages). Task 2 asks to determine whether a claim in a tweet can be verified using a set of previously fact-checked claims (in Arabic and English). Task 3 asks to predict the veracity of a news article and its topical domain (in English). The evaluation is based on mean average precision or precision at rank k for the ranking tasks, and macro-F1 for the classification tasks. This was the most popular CLEF-2021 lab in terms of team registrations: 132 teams. Nearly one-third of them participated: 15, 5, and 25 teams submitted official runs for tasks 1, 2, and 3, respectively. © 2021, Springer Nature Switzerland AG.
ABSTRACT
We describe the fourth edition of the CheckThat! Lab, part of the 2021 Cross-Language Evaluation Forum (CLEF). The lab evaluates technology supporting various tasks related to factuality, and it is offered in Arabic, Bulgarian, English, and Spanish. Task 1 asks to predict which tweets in a Twitter stream are worth fact-checking (focusing on COVID-19). Task 2 asks to determine whether a claim in a tweet can be verified using a set of previously fact-checked claims. Task 3 asks to predict the veracity of a target news article and its topical domain. The evaluation is carried out using mean average precision or precision at rank k for the ranking tasks, and F1 for the classification tasks. © 2021, Springer Nature Switzerland AG.